bold italics
To create a floating table of contents, add the following to the YAML header at the top of the file:
output:
  html_document:
    toc: yes
    toc_float: true
Load packages
library(tidyverse)
## Warning: package 'tidyverse' was built under R version 3.4.2
## Loading tidyverse: ggplot2
## Loading tidyverse: tibble
## Loading tidyverse: tidyr
## Loading tidyverse: readr
## Loading tidyverse: purrr
## Loading tidyverse: dplyr
## Warning: package 'readr' was built under R version 3.4.2
## Warning: package 'purrr' was built under R version 3.4.2
## Conflicts with tidy packages ----------------------------------------------
## filter(): dplyr, stats
## lag(): dplyr, stats
library(plotly)
## Warning: package 'plotly' was built under R version 3.4.2
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
## last_plot
## The following object is masked from 'package:stats':
##
## filter
## The following object is masked from 'package:graphics':
##
## layout
Load the data
# Download into a local "data" folder (relative to the working directory) so the code runs on any machine
dir.create("data", showWarnings = FALSE)
download.file("https://raw.githubusercontent.com/swcarpentry/r-novice-gapminder/gh-pages/_episodes_rmd/data/gapminder-FiveYearData.csv", destfile = "data/gapminder-FiveYearData.csv")
gapminder <- read.csv("data/gapminder-FiveYearData.csv")
What does the data look like?
head(gapminder)
##       country year      pop continent lifeExp gdpPercap
## 1 Afghanistan 1952  8425333      Asia  28.801  779.4453
## 2 Afghanistan 1957  9240934      Asia  30.332  820.8530
## 3 Afghanistan 1962 10267083      Asia  31.997  853.1007
## 4 Afghanistan 1967 11537966      Asia  34.020  836.1971
## 5 Afghanistan 1972 13079460      Asia  36.088  739.9811
## 6 Afghanistan 1977 14880372      Asia  38.438  786.1134
How does life expectancy relate to GDP per capita in each country of the Americas?
p <- ggplot(data = gapminder[gapminder$continent == "Americas", ],
            aes(x = gdpPercap, y = lifeExp, color = continent)) +
  geom_point() +
  scale_x_log10() +
  geom_smooth(method = "lm") + # adds a linear fit to each panel (one per country)
  facet_wrap(~country)
p
Same plot, interactive
ggplotly(p)
## We recommend that you use the dev version of ggplot2 with `ggplotly()`
## Install it with: `devtools::install_github('hadley/ggplot2')`
If you are repeating yourself in your code, you may be able to solve that problem by making your own function!
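As a minimal sketch of that idea, the function below wraps a calculation you might otherwise copy and paste: subsetting by year and/or country, then computing total GDP. The function name `calc_gdp` and the small data frame `gap_small` are hypothetical, made up for illustration; the two rows are taken from the `head(gapminder)` output above.

```r
# Two rows copied from the head(gapminder) output (illustration only)
gap_small <- data.frame(
  country   = c("Afghanistan", "Afghanistan"),
  year      = c(1952, 1957),
  pop       = c(8425333, 9240934),
  gdpPercap = c(779.4453, 820.8530)
)

# Hypothetical helper: optionally subset by year and/or country,
# then add a gdp column (population * GDP per capita)
calc_gdp <- function(dat, year = NULL, country = NULL) {
  if (!is.null(year)) {
    dat <- dat[dat$year %in% year, ]
  }
  if (!is.null(country)) {
    dat <- dat[dat$country %in% country, ]
  }
  dat$gdp <- dat$pop * dat$gdpPercap
  dat
}

gdp_all  <- calc_gdp(gap_small)               # every row
gdp_1957 <- calc_gdp(gap_small, year = 1957)  # a single year
```

Once the logic lives in one function, a fix or change only has to be made in one place, and the same call should work on the full `gapminder` data frame.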
dplyr
You will likely want to pull out subsets of your data frame and/or calculate the mean of a variable for a particular subset. dplyr is your friend!
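Here is a small sketch of both tasks, assuming dplyr is installed. The data frame `toy` is made up for illustration: its columns mirror gapminder, the Afghanistan rows come from the `head()` output above, and the "Exampleland" rows are invented values.

```r
library(dplyr)

# Toy data with gapminder-style columns (Exampleland values are invented)
toy <- data.frame(
  country   = c("Afghanistan", "Afghanistan", "Exampleland", "Exampleland"),
  continent = c("Asia", "Asia", "Americas", "Americas"),
  year      = c(1952, 1957, 1952, 1957),
  lifeExp   = c(28.801, 30.332, 60.0, 62.0)
)

# Subset: keep only the rows for one continent.
# dplyr::filter is written out in full because other packages
# (stats, plotly) also define a filter() function.
americas <- toy %>%
  dplyr::filter(continent == "Americas")

# Summarise: mean life expectancy for each continent
mean_life <- toy %>%
  group_by(continent) %>%
  summarise(mean_lifeExp = mean(lifeExp))
```

The same pipeline should work on the real data by replacing `toy` with `gapminder`.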
tidyr
R likes ‘long’ format data, where every row is an observation: a single column holds the observed values and the other columns identify each observation. (Exceptions apply when you have multiple types of observations.) To switch back and forth between ‘wide’ (how we typically enter data in a spreadsheet) and ‘long’, use tidyr.
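A minimal sketch of the round trip, assuming tidyr 1.0.0 or later, which provides `pivot_longer()` and `pivot_wider()` (older versions use `gather()` and `spread()` for the same job). The data frame `wide` and its values are made up for illustration.

```r
library(tidyr)

# 'Wide' layout: one row per country, one column per year (invented values)
wide <- data.frame(
  country      = c("A", "B"),
  lifeExp_1952 = c(30, 60),
  lifeExp_1957 = c(32, 62)
)

# Wide -> long: one row per country-year observation
long <- pivot_longer(
  wide,
  cols         = -country,     # every column except the identifier
  names_to     = "year",
  names_prefix = "lifeExp_",   # strip the prefix, leaving just the year
  values_to    = "lifeExp"
)

# Long -> wide: back to one column per year
wide_again <- pivot_wider(
  long,
  names_from   = year,
  values_from  = lifeExp,
  names_prefix = "lifeExp_"
)
```

Note that `year` comes out of `pivot_longer()` as character, because it is built from column names; convert it with `as.integer()` if you need numbers.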